Impacts of Climate Change on Mortality: An extrapolation of temperature effects based on time series data in France
This R Markdown Notebook reproduces the numerical applications presented in the paper “Impacts of Climate Change on Mortality: An extrapolation of temperature effects based on time series data in France”, available here.
The objective of this Notebook is to illustrate the estimation techniques and the presentation of results described in this article. Since daily mortality data is not publicly available, the estimates obtained from the DLNM model are presented, but cannot be reproduced. However, we present the functions to be applied, enabling the reader to replicate the process with their own data.
For the sake of brevity and code clarity, we limit the presentation to the aggregated results in Metropolitan France only. We also consider only the scenarios related to one climate model instead of the 12 presented in this paper. This simplification does not limit the understanding of the code or the reproducibility of the results.
Note that we use:
- an adjusted version of the
MultiMoMo package. Some functions
have been re-coded or added to better suit our application.
- the dlnm package and an adjusted version of the attrdl function
initially developed by Antonio Gasparrini.
1 Set-up
First, we set up the working environment and load the necessary functions and packages for the application.
# Folders
fold <- getwd()
fold_data <- paste0(fold, "/data/")
fold_bib <- paste0(fold, "/functions/")# Load functions
for (f in list.files(fold_bib))
source(paste(fold_bib,f,sep=""), encoding = "UTF-8")
#----------------------------------------------------------
# Load packages
invisible(lapply(c(
"MultiMoMo",
"HMDHFDplus",
"systemfit",
"lattice",
"grid",
"ggplot2",
"gridExtra",
"locfit",
"scales",
"lubridate",
"splines",
"mgcv",
"dlnm",
"MASS",
"mvtnorm",
"Rfast",
"parallel",
"data.table",
"formattable",
"RColorBrewer",
"readxl",
"kableExtra",
"dplyr",
"tidyr",
"tidyverse",
"rmarkdown",
"ggthemes",
"cowplot",
"ggridges",
"viridis",
"hrbrthemes",
"colorspace",
"ggbeeswarm",
"ggpubr"),
instal.import.package))
options(dplyr.summarise.inform = FALSE)
#----------------------------------------------------------
# Graphical parameters
theme_set(theme_bw())
trellis.device(color = FALSE)# Forecasting period
start_year <- 2020
end_year <- 2100
year_breaks <- c(1980, 1989, 1999, 2009, 2019, 2029, 2039, 2049, 2059,
2069, 2079, 2089, 2100)
year_labels <- c("1980s", "1990s", "2000s", "2010s", "2020s", "2030s",
"2040s", "2050s", "2060s", "2070s", "2080s","2090s")
# Heatwave parameter - french definition
min_level <- 18
max_level <- 30
# Monte-Carlo simulations
nsim <- 1000 # number of simulations
parallel <- "no" # parallel option
ncpus <- 1L # number of cores
# Other parameters
AGE_MAX <- 1052 Load data
In this section, we start by loading the input data. Three types of information are considered: historical mortality data (including daily temperatures and daily deaths), annual mortality data, and climate scenarios. More information regarding the data description is available in the paper.
2.1 Daily Mortality data
We import daily data series for death counts (which are not freely available) and temperatures. This dataset is constructed by merging:
- daily mortality data,
- daily temperature data.
daily_data <- get(load(file = paste0(fold_data, "/daily_data.RData")))
# Summary of the data - list with 2 items for females (`f`) and males (`m`)
head(daily_data)## $m
## datedec age_bk nb_deaths years tmax tmax3 tmin
## <Date> <char> <num> <int> <num> <num> <num>
## 1: 1980-01-01 0-64 265 1980 5.842857 6.464286 0.22142857
## 2: 1980-01-01 65-74 190 1980 5.842857 6.464286 0.22142857
## 3: 1980-01-01 75-84 246 1980 5.842857 6.464286 0.22142857
## 4: 1980-01-01 85+ 125 1980 5.842857 6.464286 0.22142857
## 5: 1980-01-02 0-64 276 1980 3.742857 5.173810 -0.95000000
## ---
## 58436: 2019-12-30 85+ 324 2019 8.085714 8.195238 0.02142857
## 58437: 2019-12-31 0-64 145 2019 7.285714 7.278571 0.22142857
## 58438: 2019-12-31 65-74 175 2019 7.285714 7.278571 0.22142857
## 58439: 2019-12-31 75-84 215 2019 7.285714 7.278571 0.22142857
## 58440: 2019-12-31 85+ 313 2019 7.285714 7.278571 0.22142857
## tmin3 tavg tavg3 time dow month
## <num> <num> <num> <num> <int> <int>
## 1: 1.8190476 2.289286 3.801190 1 3 1
## 2: 1.8190476 2.289286 3.801190 1 3 1
## 3: 1.8190476 2.289286 3.801190 1 3 1
## 4: 1.8190476 2.289286 3.801190 1 3 1
## 5: 0.2261905 1.550000 2.483333 2 4 1
## ---
## 58436: 2.0380952 3.271429 4.640476 14609 2 12
## 58437: 0.3928571 3.464286 3.378571 14610 3 12
## 58438: 0.3928571 3.464286 3.378571 14610 3 12
## 58439: 0.3928571 3.464286 3.378571 14610 3 12
## 58440: 0.3928571 3.464286 3.378571 14610 3 12
##
## $f
## datedec age_bk nb_deaths years tmax tmax3 tmin
## <Date> <char> <num> <int> <num> <num> <num>
## 1: 1980-01-01 0-64 122 1980 5.842857 6.464286 0.22142857
## 2: 1980-01-01 65-74 117 1980 5.842857 6.464286 0.22142857
## 3: 1980-01-01 75-84 280 1980 5.842857 6.464286 0.22142857
## 4: 1980-01-01 85+ 271 1980 5.842857 6.464286 0.22142857
## 5: 1980-01-02 0-64 116 1980 3.742857 5.173810 -0.95000000
## ---
## 58436: 2019-12-30 85+ 552 2019 8.085714 8.195238 0.02142857
## 58437: 2019-12-31 0-64 107 2019 7.285714 7.278571 0.22142857
## 58438: 2019-12-31 65-74 98 2019 7.285714 7.278571 0.22142857
## 58439: 2019-12-31 75-84 181 2019 7.285714 7.278571 0.22142857
## 58440: 2019-12-31 85+ 514 2019 7.285714 7.278571 0.22142857
## tmin3 tavg tavg3 time dow month
## <num> <num> <num> <num> <int> <int>
## 1: 1.8190476 2.289286 3.801190 1 3 1
## 2: 1.8190476 2.289286 3.801190 1 3 1
## 3: 1.8190476 2.289286 3.801190 1 3 1
## 4: 1.8190476 2.289286 3.801190 1 3 1
## 5: 0.2261905 1.550000 2.483333 2 4 1
## ---
## 58436: 2.0380952 3.271429 4.640476 14609 2 12
## 58437: 0.3928571 3.464286 3.378571 14610 3 12
## 58438: 0.3928571 3.464286 3.378571 14610 3 12
## 58439: 0.3928571 3.464286 3.378571 14610 3 12
## 58440: 0.3928571 3.464286 3.378571 14610 3 12
# Get quantiles for extreme hot and cold of average daily temperature
q025 <- quantile(daily_data$m$tavg, 0.025)
q975 <- quantile(daily_data$m$tavg, 0.975)The object daily_data is a list containing two datasets for females (f) and
males (m). Each dataset contains:
datedec: the date,age_bk: age buckets,nb_deaths: number of deaths,years: years,tmax: maximum temperature of the daytmax3: maximum temperature over the last 3 daystmin: minimum temperature over the last 3 daystmin3: minimum temperature over the last 3 daystavg: average temperature of the daytavg3: average temperature over the last 3 daystime: day number as a integer numberdow: day of the week as a integer numbermonth: month number
Our DLNM model for nb_deaths is fitted based on the tavg variable, as
well as datedec, dow and month.
2.2 Annual mortality data from the HMD
The annual mortality data considered are extracted from the Human Mortality Database . We downloaded the exposure to risk and death counts from the HMD website. We consider data from Metropolitan France for the calibration period \(\mathcal{T}_y = \left\lbrace 1980, \ldots, 2019 \right\rbrace\) and the age range \(\mathcal{X}\)\(= \left\lbrace 0, \ldots, 105 \right\rbrace\).
2.3 Data from climate models
We import daily data series for climate trajectories. The models selected in the paper are the 12 models available through the portal for RCP2.6, RCP4.5, and RCP8.5 scenarios. In this Notebook, we only consider the model CNRM-CM5 / ALADIN63 with the RCP8.5 scenario.
3 Estimating the DLNM model
The distributed lag non-linear generalized model (DLNM) is a reference standard in climate epidemiology for assessing the impact of delayed environmental factors on a response variable, see e.g. A. Gasparrini, Armstrong, and Kenward (2010) or Antonio Gasparrini (2014).
We fit DLNM models for each sex \(g\) and each age bucket \(k\). This model aims to assess the influence of temperatures on the daily number of deaths.
Let \(\lambda_{k,t,d}^{(g)} = \mathbb{E}\left({D_{k,t,d}^{(g)}}\right)\) represents the expected number of deaths for day \(d\), year \(t\) and sub-group \(k\). \(\vartheta_{d,t}\) denotes the daily average temperature of day \(d\) for year \(t\). The number of daily deaths \(D_{k,t,d}^{(g)}\), aggregated by \(K\) age groups and by sex \(g\), for each day \(d \in \mathcal{D}^\star = \left\lbrace 1,2,\ldots, 365, (366)\right\rbrace\) of year \(t\) is modeled by \[ \ln(\lambda_{k,t,d}^{(g)})=\eta_k^{(g)} + s(\vartheta_{d,t}, L; \boldsymbol{\theta}_k^{(g)} ) + \sum_{p=1}^{P}{h_p( z_{d,p} ; \boldsymbol{\zeta}_{k,p}^{(g)})}, \] where:
- \(s (\vartheta_{d,t} , L ; \boldsymbol{\theta}_{k}^{(g)}) = \sum_{l=0}^{L}{f\cdot w(\vartheta_{d-l,t},l;\boldsymbol{\theta}_k^{(g)})}\) is a cross-basis non-linear function, capturing the cumulated effect of the daily average temperature \(\vartheta_{d,t}\) over a maximum of \(L\) days where \(f\cdot w \left( \cdot, \cdot \right)\) is a bi-dimensional exposure–lag–response function,
- \(h_p( z_{d,p} ; \boldsymbol{\zeta}_{k,p}^{(g)})\) are smooth univariate functions of categorical time variables \(z_{d,p}\), e.g. day, day of the week, month or year. They capture residual seasonal effects, demographic shifts, or other long-term trends not accounted for by the other covariates.
We consider a bi-dimensional spline function \(s(\cdot, \cdot)\) as the surface of Relative Risk (RR). All DLNM models are calibrated based on daily average temperature data over the period 1980-2019, using the dlnm package (A. Gasparrini 2011). We consider for the exposure–response curve a natural cubic B-spline with three internal knots that are placed at the 10th, 75th, and 90th percentiles of the daily temperature distribution within the observational period. We also use a natural cubic B-spline for the lag-response curve with an intercept and three internal knots placed at evenly spaced intervals on the log scale. A maximum of \(L=21\) days is considered to account for delayed effects of both cold and hot temperatures.
3.1 Parameters
# Segmentation parameters
list_sexe <- list(m = 1, f = 2)
age_breaks <- c(0, 64, 74, 84, Inf)
age_labels <- c("0-64", "65-74", "75-84", "85+")
# Training period
annee_deb <- 1980
annee_fin <- 2019
# DLNM Parameters
param_dlnm_whole <- list(
# main model, cubic natural spline with three internal knots in
# the 10th, 75th, 90th percentiles of the temperature distribution
varfun = "bs",
vardegree = 2,
varper = c(10, 75, 90),
## specification of the lag function
# Definition of the maximum lag, that is, 21 days
lag = 21,
lagnk = 3,
## degree of freedom for seasonality
dfseas = 8,
## degree of freedom for trend
dftrend = NULL
)3.2 Model estimation
We fit the model for each sex and age bucket.
3.3 Results and diagnostics
Based on Antonio Gasparrini and Leone (2014), we calculate the temperature-attributable fraction for day \(d\) as \[ \widehat{\text{AF}}^{(g)}_{k,t,d} = 1-\exp{\left(- \sum_{l=0}^{L}{f\cdot w(\vartheta_{d,l}, l;\widehat{\boldsymbol{\theta}}_k^{(g)})} \right)}. \] Then, we deduce the death counts attributable to temperatures for each day of the year \(d\), and age \(x \in X_k\), \(k \in \left\lbrace1, \ldots, K\right\rbrace\), using the formula \[ \widehat{\bar{D}}_{x,t,d}^{(g)} = \widehat{\text{AF}}^{(g)}_{k,t,d} \times \sum_{l=0}^{L} \frac{D_{x,t,d+l}^{(g)}}{L+1}. \] The term \(\exp{\left(s(\vartheta_{d,t}, L; \widehat{\boldsymbol{\theta}}_k^{(g)} )\right)}\) corresponds to the cumulative RR of temperature mortality. We display below the cumulative RR curves for temperatures.
For analyzing the goodness of fit performance of our models, we compute the estimated daily death counts for each day \(d\) of year \(t\), which is given by \[ \widehat{D}_{k,t,d}^{(g)} = \exp{\left(s(\vartheta_{d,t}, L; \widehat{\boldsymbol{\theta}}_k^{(g)} )\right)} \exp{\left(\widehat{\eta}_k^{(g)} + z_{d,1} \widehat{\zeta}_{k,1}^{(g)} + \sum_{t \in \mathcal{T}_y}{h_t( z_{t,p} ; \widehat{\boldsymbol{\zeta}}_{k,t}^{(g)})}\right)}. \] Based on these predicted counts, we display the residuals of the Deviance by year and age group, and compare the distribution of observed (in blue) and predicted (in green) death counts for each month, distinguishing by sex and decade for females and males.
# in-sample performance of the fitted model for the whole year.
perf_mort <- lapply(names(daily_data), function(x){
perf_dnlm(fit[[x]])
})
names(perf_mort) <- names(daily_data)3.3.1 Females
3.3.2 Males
3.4 Estimation of the temperature- attributable fractions
The model’s setup allows for a range of predictions that precisely gauge the influence of heat, cold, or specific events occurring on each day of subset \(\mathcal{D}_t \subseteq \mathcal{D}^\star\) of year \(t\).
The sum of the contributions from each day of this subperiod \(\mathcal{D}_t\) for a year \(t \in \mathcal{T}_y\) is as follow \[ \widehat{\bar{D}}_{x,t}^{(g)} = \sum_{d \in \mathcal{D}_t}{\widehat{\bar{D}}_{x,t,d}^{(g)}\textbf{1}_{\left\lbrace d \in \mathcal{D}_t\right\rbrace }}. \]
Thus, the total attributable fraction is estimated for each age \(x \in X_k\) and year \(t \in \mathcal{T}_y\) as \[ \widehat{\text{AF}}_{x,t}^{(g)} (\mathcal{D}_t)= \dfrac{\widehat{\bar{D}}_{x,t}^{(g)}}{\sum_{d \in \mathcal{D}_t} D_{x,t,d}^{(g)}}, \]
We compute the temperature-attributable fractions \(\widehat{\text{AF}}_{x,t}^{(g)} (\mathcal{D}_t)\) for both women and men, and aggregate them over all ages to facilitate visualization on the figure below. The estimation error is calculated based on 1000 replications of parametric bootstraps.
# Run excess of mortality
xs_mort <- lapply(names(daily_data), function(x){
res <- excess_mort_dlnm(daily_data[[x]], fit[[x]], q_range = c(q025, q975),
nsim = nsim, parallel = parallel, ncpus = ncpus)
return(res)
})
names(xs_mort) <- names(daily_data)
# arrange data
xs_data <- lapply(names(xs_mort), function(i){
xs_mort[[i]]$excess_mort %>%
dplyr::mutate(gender = i) %>%
rename(year = years)
})
xs_data <- do.call("rbind", xs_data)
# print attributable fractions
summary_temp_effect(xs_data)4 The Li-Lee model
We decomposition the number of deaths and death rates into two components \[ D_{x,t}^{(g)} = \widetilde{D}_{x,t}^{(g)} + \bar{D}_{x,t}^{(g)} \Rightarrow \widehat{m}_{x,t}^{(g)} = \widetilde{m}_{x,t}^{(g)} + \bar{m}_{x,t}^{(g)}, \] where \(\bar{D}_{x,t}^{(g)}\) and \(\widetilde{D}_{x,t}^{(g)}\) are respectively the number of deaths attributable and non attributable to temperature effects.
Then, we consider the two-populations Li and Lee (2005) model for the crude
central death rates non attributable to temperature effects as
\[
\ln \left(\widetilde{m}_{x,t}^{\left(g\right)} \right)= A_x + B_{x}K_{t} + \alpha_{x}^{\left( g \right)} + \beta_{x}^{\left( g \right)} \kappa_{t}^{\left( g \right)}.
\]
We use the following specifications:
- Identifiability constraints
\[ \sum\limits_{t \in \mathcal{T}_y} K_{t} = 0 \text{ and } \sum\limits_{x \in \mathcal{X}} B_{x}^2 = 1, \] \[ \sum\limits_{t \in \mathcal{T}_y} \kappa_{t}^{\left( g \right)} = 0 \text{ and } \sum\limits_{x \in \mathcal{X}} (\beta_{x}^{\left( g \right)})^2 = 1, \text{ for } g \in \mathcal{G} \]
- Time series model with coherence assumption
\[ K_{t} = \delta + K_{t-1} + e_{t} \;\; (\text{RWD with drift}) \] \[ \kappa_{t}^{\left( g \right)} = c^{\left( g \right)} + \phi^{\left( g \right)} \kappa_{t-1}^{\left( g \right)} + r_{t}^{\left( g \right)} \;\; (\text{AR(1) with drift and } \vert \phi^{(g)} \vert <1). \] Errors terms are white noises with a mean of zero and a variance-covariance matrix \(\boldsymbol{\Sigma}\).
4.1 Fitting the Poisson regression model
For calibrating the parameters \(({A}_x, {B}_x, {K}_{t}, {\alpha}_{x}^{(f)}, {\beta}_{x}^{(f)}, {\kappa}_{t}^{(f)}, {\alpha}_{x}^{(m)}, {\beta}_{x}^{(m)}, {\kappa}_{t}^{(m)})\), we use a Poisson assumption for the number of virtual deaths as in Brouhns, Denuit, and Vermunt (2002)
\[
\widetilde{D}_{x,t}^{(g)} \sim \text{Pois}\left( E_{x,t}^{(g)} \widetilde{m}_{x,t}^{(g)}\right).
\]
Thus, with the attributable fraction \(\text{AF}^{(g)}_{x,t}\), we also have a
Poisson formulation with log-link function for the overall number of deaths
\[
{D}_{x,t}^{(g)} \sim \text{Pois}\left( E_{x,t}^{(g)} T_{x,t}^{(g)} \widetilde{m}_{x,t}^{(g)}\right),
\]
where \(T_{x,t}^ {(g)} = \left( 1 - \text{AF}^{(g)}_{x,t}\right)^{-1}\)
can be consider into the offset term.
4.1.1 Calculating the offset term based on the attributable fraction
First, we define the mortality model hyper-parameters, i.e. the age range, the observation period, and the name of each group.
Then, we select all_effect (all the effects of temperatures over the year)
and assume that the offset term \(T_{x,t}^{(g)}\) is constant for all ages
in each bucket.
# Data for women and men
xs_data <- get(load(file = paste0(fold_data, "/xs_data.RData")))
xs_f <- xs_data %>%
dplyr::filter(gender == "f", temp_effect == "all_effect") %>%
dplyr::select(year, age_bk, ajust_factor)
xs_m <- xs_data %>%
dplyr::filter(gender == "m", temp_effect == "all_effect") %>%
dplyr::select(year, age_bk, ajust_factor)
# Transformation to matrix
transf_attrib_matrix <- function(df){
mat <- matrix(data = 1, ncol = length(xv), nrow = length(yv))
for(tt in yv){
mat[tt - yv[1] + 1, 1:65] <- dplyr::filter(df, year == tt,
age_bk == "0-64")$ajust_factor
mat[tt - yv[1] + 1, 66:75] <- dplyr::filter(df, year == tt,
age_bk == "65-74")$ajust_factor
mat[tt - yv[1] + 1, 76:85] <- dplyr::filter(df, year == tt,
age_bk == "75-84")$ajust_factor
mat[tt - yv[1] + 1, 86:(max(xv) + 1)] <- dplyr::filter(
df, year == tt, age_bk == "85+")$ajust_factor
}
dimnames(mat) <- list(yv, xv)
return(mat)
}
xs_f <- transf_attrib_matrix(xs_f)
xs_m <- transf_attrib_matrix(xs_m)4.1.2 Adjusting exposure to risk
Now, we multiply the offset term \(T_{x,t}^{(g)}\) by the exposure to risk. Subsequently, we construct a new dataset that we use to fit the Li-Lee model.
# Create datasets for deaths with ajusted exposure
adj_mort_dt <- list(
UNI = list(
f = list(dtx = round(mort_dt$UNI$f$dtx),
etx = mort_dt$UNI$f$etx * xs_f,
wa = mort_dt$UNI$f$wa),
m = list(dtx = round(mort_dt$UNI$m$dtx),
etx = mort_dt$UNI$m$etx * xs_m,
wa = mort_dt$UNI$m$wa)),
ALL = list(dtx = mort_dt$ALL$dtx,
etx = mort_dt$ALL$etx,
wa = mort_dt$ALL$wa)
)
## replace exposure to adjusted exposure
adj_mort_dt$ALL$etx <- adj_mort_dt$UNI$f$etx + adj_mort_dt$UNI$m$etx4.1.3 Model calibration
We calibrate the parameters \(({A}_x, {B}_x, {K}_{t}, {\alpha}_{x}^{(f)}, {\beta}_{x}^{(f)}, {\kappa}_{t}^{(f)}, {\alpha}_{x}^{(m)}, {\beta}_{x}^{(m)}, {\kappa}_{t}^{(m)})\) of the Li-Lee model using both the original and adjusted exposure to risk. This allows to assess the impact of temperature-attributable deaths on these estimated parameters.
These models are estimated by maximizing the Poisson log-likelihood. As
detailed in the paper, our procedure consists in two steps. First, we estimate
the common parameters \(({A}_x, {B}_x, {K}_{t})\), followed by the estimation
of sex-specific parameters. This estimation is performed
using the R-package MultiMoMo (Antonio, Devriendt, and Robben 2022).
# Models with the original exposure to risk
fit_M <- fit_li_lee(xv, yv, yv, "m", data = mort_dt, method = "NR",
Ax = TRUE, exclude_coh = FALSE)
fit_F <- fit_li_lee(xv, yv, yv, "f", data = mort_dt, method = "NR",
Ax = TRUE, exclude_coh = FALSE)# Models with the adjusted exposure to risk
adj_fit_M <- fit_li_lee(xv, yv, yv, "m", data = adj_mort_dt, method = "NR",
Ax = TRUE, exclude_coh = FALSE)
adj_fit_F <- fit_li_lee(xv, yv, yv, "f", data = adj_mort_dt, method = "NR",
Ax = TRUE, exclude_coh = FALSE)The calibrated parameters are displayed as follow.
list_fit <- list(fit_F = fit_F,
fit_M = fit_M,
adj_fit_F = adj_fit_F,
adj_fit_M = adj_fit_M)
# print figures
myplot_parameters_li_lee(xv, yv, list_fit)4.1.4 Goodness of fit
We now display residuals for the model with exposures adjusted of temperature effects (model with original exposure provides similar results).
fig_res_f <- plot_residuals_li_lee(xv, yv, adj_fit_F, "FR", "Female")
fig_res_m <- plot_residuals_li_lee(xv, yv, adj_fit_M, "FR", "Male")
plot_grid(fig_res_f + ggtitle("Females"),
fig_res_m + ggtitle('Males'),
label_size = 12)4.2 Calibrating and forecasting the time series model
In this section, we focus on calibrating the time series model, rewritten in matrix form as \[ \boldsymbol{Y}_{t} = \boldsymbol{\Upsilon}+ \boldsymbol{\Phi}\boldsymbol{Y}_{t-1} + \boldsymbol{E}_t, \] where \[ \boldsymbol{Y}_{t} = \begin{pmatrix} K_t \\ \kappa_t^{(f)} \\ \kappa_t^{(m)} \end{pmatrix}, \boldsymbol{\Upsilon}= \begin{pmatrix} \delta \\ c^{(f)} \\ c^{(m)} \end{pmatrix}, \boldsymbol{\Phi}= \begin{pmatrix} 1 & 0 & 0\\ 0 & \phi^{(f)} & 0\\ 0 & 0 & \phi^{(m)}\\ \end{pmatrix} \text{ and } \boldsymbol{E}_t = \begin{pmatrix} e_t \\ r_t^{(f)} \\ r_t^{(m)} \end{pmatrix}. \]
We calibrate this model using two sets of parameters for \(\boldsymbol{Y}_{t}\), i.e.
the parameters obtained with the original exposure to risk and those obtained
with the temperauture-adjusted exposure to risk. These
parameters are also estimated through maximum likelihood using R-package
MultiMoMo.
4.2.1 Specification
We consider the projection period from the last year of the calibration set to \(H\), the projection horizon.
4.2.2 Calibration and projection
In this section, we estimate the parameters \(\boldsymbol{\Upsilon}\), \(\widehat{\Phi}\), and \(\widehat{\boldsymbol{\Sigma}}\). Then, we forecast \(\boldsymbol{Y}_t\) by simulating the innovation errors \(\boldsymbol{E}_t\) from a multivariate Gaussian vector with zero mean and covariance matrix \(\widehat{\boldsymbol{\Sigma}}\). In this application, 1000 Monte Carlo simulations is considered.
set.seed(1234)
# Model with original exposures
proj_par <- project_parameters(fit_M, fit_F, n_ahead, n_sim,
arima_spec, est_method)
# Model with adjusted exposures
adj_proj_par <- project_parameters(adj_fit_M, adj_fit_F, n_ahead, n_sim,
arima_spec, est_method)We verify that the series modeled by an AR(1) process are stationary, i.e. parameter estimates \(\widehat{k}_t^{(g)}\) are smaller than 1.
tab_param <- data.frame(
v1 = c("Original Li-Lee model", "Adjusted exposure to risk"),
v2 = c(proj_par$coef_KtM, adj_proj_par$coef_KtM),
v3 = c(proj_par$coef_ktM[1], adj_proj_par$coef_ktM[1]),
v4 = c(proj_par$coef_ktM[2], adj_proj_par$coef_ktM[2]),
v5 = c(proj_par$coef_ktF[1], adj_proj_par$coef_ktF[1]),
v6 = c(proj_par$coef_ktF[2], adj_proj_par$coef_ktF[2])
)
# Print table
tab_param %>%
kbl( booktabs = T,
align='c',
col.names = c("Calibration data", "$\\delta$",
"$c^{(m)}$", "$\\phi^{(m)}$",
"$c^{(f)}$", "$\\phi^{(f)}$"),
digits=4,
escape = FALSE) %>%
kable_classic_2(full_width = F) %>%
kable_styling(latex_options = "hold_position", font_size = 8)| Calibration data | \(\delta\) | \(c^{(m)}\) | \(\phi^{(m)}\) | \(c^{(f)}\) | \(\phi^{(f)}\) |
|---|---|---|---|---|---|
| Original Li-Lee model | -0.2340 | -0.0066 | 0.9692 | -0.0266 | 0.9963 |
| Adjusted exposure to risk | -0.2341 | -0.0045 | 0.9697 | -0.0256 | 0.9999 |
The estimated covariance matrix \(\boldsymbol{C}\) is given as follow.
# Create correction matrix
corr_function <- function(S)
{
D <- diag(sqrt(diag(S)))
Dinv <- solve(D)
R <- Dinv %*% S %*% Dinv
dimnames(R) <- dimnames(S)
R
}
# Show correction matrix in table
tab_corr <- cbind(data.frame(corr_function(proj_par$cov_mat)),
data.frame(corr_function(adj_proj_par$cov_mat)))
names(tab_corr) <- paste0("v", 1:ncol(tab_corr))
row.names(tab_corr) <- NULL
# Print table
tab_corr %>%
dplyr::mutate(v = c("$e_t$", "$r_t^{(m)}$", "$r_t^{(f)}$"),
.before = "v1") %>%
kbl(booktabs = T,
align='c',
col.names = c("", "$e_t$", "$r_t^{(m)}$", "$r_t^{(f)}$",
"$e_t$", "$r_t^{(m)}$", "$r_t^{(f)}$"),
digits=4,
escape = FALSE) %>%
add_header_above(c(" " = 1, "Original Li-Lee model" = 3,
"Adjusted exposure to risk" = 3)) %>%
kable_classic_2(full_width = F) %>%
kable_styling(latex_options = "hold_position", font_size = 8)| \(e_t\) | \(r_t^{(m)}\) | \(r_t^{(f)}\) | \(e_t\) | \(r_t^{(m)}\) | \(r_t^{(f)}\) | |
|---|---|---|---|---|---|---|
| \(e_t\) | 1.0000 | -0.7393 | 0.9449 | 1.0000 | -0.5648 | 0.9330 |
| \(r_t^{(m)}\) | -0.7393 | 1.0000 | -0.7152 | -0.5648 | 1.0000 | -0.5447 |
| \(r_t^{(f)}\) | 0.9449 | -0.7152 | 1.0000 | 0.9330 | -0.5447 | 1.0000 |
4.2.3 Projecting parameters
We present the projection of parameters \(\widehat{K}_t\), \(\widehat{\kappa}_t^{(f)}\), and \(\widehat{\kappa}_t^{(m)}\) over the period 2020-2100 with the \(95\%\) prediction intervals. These projections are shown for both Li-Lee models with and without temperature effects.
5 Multi-mortality and climate models
Now that death rates not attributable to temperature effects can be projected, we focus on adding mortality shocks resulting from daily temperature variations. To achieve this, we consider a temperature trajectory derived from an external climate model \((\vartheta_{d,t})\), and then project the central death rates with temperature effects accumulated over period \(\mathcal{D}_t\) as \[ \widehat{m}_{x,t}^{(g)} = {\widetilde{m}}_{x,t}^{(g)} \left[ 1 + \sum_{d \in \mathcal{D}_t}{ \omega_{x,t,d}^{(g)} \text{AF}_{x,d,t}^{(g)} (1- \text{AF}_{x,d,t}^{(g)})^{-1}\boldsymbol{1}_{\left\lbrace d \in \mathcal{D}_t \right\rbrace }}\right]. \] where \(\omega_{x,t,d}^{(g)} =\widetilde{D}_{x,t,d}^{(g)}/\widetilde{D}_{x,t}^{(g)}\) corresponds to the weight associated with the distribution of death counts not attributable to temperatures over the period \(\mathcal{D}_t\) of year \(t\).
5.1 Projecting future mortality
Now, we simulate the projected all-cause death rates and the projected all-cause death rates not attributable to temperatures. \[ \widehat{q}_{x,t}^{(g)} = 1 - \exp{\left(-\widehat{\widehat{m}}_{x,t}^{(g)}\right)}, \quad \widehat{\widetilde{q}}_{x,t}^{(g)} = 1 - \exp{\left(\widehat{\widetilde{m}}_{x,t}^{(g)}\right)}, \]
For each simulation, we also compute the number of years of life expectancy lost (or gained) due to temperatures \[ \Delta \widehat{e}_{x,t}^{(g)} = \sum_{k=1}^{t_{\max}}{\left[ \prod_{j=0}^{k-1}{\left(1-\widehat{\widetilde{q}}_{x,j}^{(g)}\right)} - \prod_{j=0}^{k-1}{\left(1-\widehat{q}_{x,j}^{(g)}\right)} \right]}. \]
The first step is to simulate the death rates \(\hat{m}_{x,t}^{(g)}\) and \(\widetilde{m}_{x,t}^{(g)}\) derived respectively from the original Li-Lee model and the model with adjusted exposure to risk.
## project mortality rates considering historical temperature effect
proj_rates <- project_mortality_rates(fit_M, fit_F, proj_par)
## project mortality rates considering without temperature effect
adj_proj_rates <- project_mortality_rates(adj_fit_M, adj_fit_F, adj_proj_par)
# reformat data
format_rates <- function(rates)
{
rates <- melt(rates)
names(rates) <- c("age", "year", "sim", "Qxt")
rates <- rates %>%
mutate(age_bk = cut(age, age_breaks, age_labels, include.lowest = T))
}
proj_rates <- lapply(proj_rates, format_rates)
names(proj_rates) <- c("m", "f")
adj_proj_rates <- lapply(adj_proj_rates, format_rates)
names(adj_proj_rates) <- c("m", "f")Second, we compute the weight associated with the distribution of death counts not attributable to temperatures over the period \(\mathcal{D}_t\) of year \(t\) \[ \omega_{x,t,d}^{(g)} =\widetilde{D}_{x,t,d}^{(g)}/\widetilde{D}_{x,t}^{(g)}. \]
We compute this weight as an average over the last 5 years.
# Run excess of mortality
xs_mort <- lapply(names(daily_data), function(x){
res <- excess_mort_dlnm(daily_data[[x]], fit[[x]], q_range = c(q025, q975),
nsim = 1, parallel = "no", ncpus = ncpus)
return(res)
})
names(xs_mort) <- names(daily_data)
# Compute weights
un_an_weight <- lapply(names(xs_mort), function(x){
# Add day and months
# Filter on the last five years
res <- xs_mort[[x]]$excess_mort %>%
mutate(day = day(datedec),
month = month(datedec)) %>%
filter(years %in% c(2015:2019), temp_effect == "all_effect") %>%
mutate(un_an = cases - an) %>%
dplyr::select(years, age_bk, un_an, day, month) %>%
group_by(age_bk, day, month) %>%
summarise(un_an = sum(un_an))
# Calculate the denominator of the weight
xs_mort_agg <- res %>%
group_by(age_bk) %>%
summarise(un_an_sum = sum(un_an))
res <- res %>%
left_join(xs_mort_agg, by = "age_bk") %>%
mutate(weight = un_an / un_an_sum) %>%
mutate(un_an = NULL,
un_an_sum = NULL)
# Correct 29-02
res$weight[res$day==29 & res$month==2] <- 5 *
res$weight[res$day==29 & res$month==2]
return(res)
})
names(un_an_weight) <- names(xs_mort)Next, we apply the projected additional mortality related to future temperature. Estimation errors related to the DLMN model is considered with 1000 bootstrap simulations.
It is worth noting that it is also possible to simultaneously consider temperature simulations from multiple climate models to capture this source of uncertainty. However, we do not include this step in this Notebook to simplify the presentation. Implementing this approach is not difficult and is been discussed in the paper.
# Change names dates
clim_model <- clim_model %>% dplyr::rename(datedec = date)
# Select the forecasting period
sc <- dplyr::filter(clim_model, years %in% start_year:end_year)
start_time <- Sys.time()
# Run simulations for males and females
res <- lapply(names(adj_proj_rates), function(x)
{
# Select population
temp_dem <- adj_proj_rates[[x]] %>%
filter(year %in% start_year:end_year)
# select weight
temp_weight <- un_an_weight[[x]]
# Launch forecasting along the rcp scenario with simulations
# and parallelization
f_mort <- forecast_mort_dlnm(temp_dem, sc, fit[[x]], temp_weight,
q_range = c(q025, q975),
sensi_cen = 0,
nsim = nsim, parallel = parallel,
ncpus = ncpus)
# Add sex indexes
f_mort <- lapply(f_mort, function(tt){
if(is.null(tt))
{
return(tt)
} else{
return(
tt %>%
dplyr::mutate(gender = x) %>%
relocate(gender, temp_effect, .before = sim)
)
}
})
return(f_mort)
})
# Merge more than two lists with the same element name
res <- Reduce(function(...) Map("rbind", ...), res)
# Save and printtime after operation
end_time <- Sys.time()
time_diff <- end_time - start_time
print(paste("Run time:", as.numeric(time_diff, units = "mins"), "mins"))## [1] "Run time: 591.297907698154 mins"
5.2 Dislaying attributable fraction
In the following figures, we display the projected attributable fraction related to future temperatures in the selected climate scenario.
To facilitate visual analysis, we calculate an aggregated attributable fraction to temperatures using the distribution of deaths known at the end of 2019, as follows \[ \widehat{\text{AF}}_{t} (\mathcal{D}_t) = \sum_{g \in \mathcal{G}} { \sum_{x \in \mathcal{X}}{\widehat{\text{AF}}_{x,t}^{(g)} (\mathcal{D}_t) \frac{D_{x, 2019}^{(g)}}{D_{2019}} }}, \]
where \(D_{2019} = \sum_{g \in \mathcal{G}}{\sum_{x \in \mathcal{X}}{D_{x, 2019}^{(g)}}}.\)
# extract the last year 2019, for calculate exposure to risk
expo <- do.call("rbind", lapply(mort_dt$UNI, function(dt){
dt <- dt$dtx[nrow(dt$dtx), ]
return(data.frame(age = 0:105,
w = dt,
age_bk = c(rep("0-64", 65), rep("65-74", 10),
rep("75-84", 10), rep("85+", 21))
))
}))
expo$gender <- c(rep("f", 106), rep("m", 106))
# Agregate expo per age bucket
W <- sum(expo$w)
expo <- expo %>%
group_by(age_bk, gender) %>%
summarise(w = sum(w) / W)# Specify the RCP number
res$tab_excess$traj_clim <- "ALADIN63_CNRM-CM5"
res$tab_excess$rcp <- "8.5"
# Weighted attributed fraction
plot_aff <- summary_forecast_temp_effect(res$tab_excess, expo, var = "af_weight_year")
# Attributed fraction with constant weight
plot_aff_no_weight <- summary_forecast_temp_effect(res$tab_excess, expo, var = "af_no_weight_year")
# Plot aggregate temperature attributed fraction (with weights)
plot_aff[[1]]p.comb <- plot_aff[[2]]
# Plot temperature attributed fraction for each age group (with weights)
p.comb[[1]]5.3 Assessing the loss in life expectancy
Finally, we assess the loss in life expectancy due to future temperatures.
res$tab_ex$traj_clim <- "ALADIN63_CNRM-CM5"
res$tab_ex$rcp <- "8.5"
plot_ev <- summary_forecast_ev_effect(res$tab_ex)
# Plot LE including temperature effects (with weights)
plot_ev[[1]]6 Conclusion
This Notebook illustrates the implementation of a multi-population mortality model incorporating the effect of temperature changes on mortality. The construction of this framework is simplified by omitting multiple scenarios and various climate models. Therefore, it does not exactly reproduce the results presented in the paper.